INTRODUCTION

Technical Field 

The field of this invention is the expression 
of mammalian proteins. 

Background

The discoveries of restriction enzymes,
cloning, sequencing, reverse transcriptase, and
monoclonal antibodies have resulted in extraordinary
capabilities in isolating, identifying, and
manipulating nucleic acid sequences. As a result of
these capabilities, numerous genes and their
transcriptional control elements have been identified
and manipulated. The genes have been used for
producing large amounts of a desired protein in
heterologous hosts (bacterial and eukaryotic host cell
systems).

In many cases, the process of obtaining coding
sequences and eliciting their expression has been a
long and arduous one. The identification of the coding
sequence, either cDNA or genomic DNA, has frequently
involved the construction of libraries, identification
of fragments of the open reading frame, examining the
flanking sequences, and the like. In mammalian genes,
where introns are frequently encountered, in many
instances the coding region has been only a small
fraction of the total nucleic acid associated with the
gene. In other cases, pseudogenes or multi-membered
gene families have obscured the ability to isolate a
particular gene of interest. Nevertheless, as
techniques have improved, there has been a continuous
parade of successful identifications and isolations of
genes of interest.

In many situations one is primarily interested 
in a source of the protein product. The cell type in 
the body which produces the protein is frequently an 
inadequate source, since the protein may be produced in 
low amounts, the protein may only be produced in a
differentiated host cell which is difficult to grow
in culture, or the host cell, particularly a
human cell, is not economic or efficient in a culture 
process for production of the product. There is, 
therefore, significant interest in developing 
alternative techniques for producing proteins of 
interest in culture with cells which provide for 
economic and efficient production of the desired 
protein and, when possible, appropriate processing of 
the protein product. 

Relevant Literature 

Mansour et al., Nature, 336:348-352, (1988),
describe a general strategy for targeting mutations to
non-selectable genes. Weidle et al., Gene, 66:193-203,
(1988), describe amplification of tissue-type
plasminogen activator with a DHFR gene and loss of
amplification in the absence of selective pressure.
Murnane and Yezzi, Somatic Cell and Molecular Genetics,
14:273-286, (1988), describe transformation of a human
cell line with an integrated selectable gene marker
lacking a transcriptional promoter, with tandem
duplication and amplification of the gene marker.
Thomas and Capecchi, Cell, 51:503-512, (1987), describe
site-directed mutagenesis by gene targeting in mouse
embryo-derived stem cells. Song et al., Proc. Natl.
Acad. Sci. USA, 84:6820-6824, (1987), describe
homologous recombination in human cells by a two-staged
integration. Liskay et al., "Homologous Recombination
Between Repeated Chromosomal Sequences in Mouse Cells,"
Cold Spring Harbor Symp. Quant. Biol., 49:183-189,
(1984), describe integration of two different mutations
of the same gene and homologous recombination between
the mutant genes. Rubnitz and Subramani, Mol. and
Cell. Biol., 4:2253-2258, (1984), describe the minimum
amount of homology required for homologous
recombination in mammalian cells. Kim and Smithies,
Nucl. Acids Res., 16:8887-8903, (1988), describe an
assay for homologous recombination using the polymerase
chain reaction.

SUMMARY OF THE INVENTION

Expression of mammalian proteins of interest
is achieved by employing homologous recombination for
integration of an amplifiable gene and other regulatory
sequences in proximity to the gene of interest without
interruption of the production of a proper
transcript. The region comprising the amplifiable gene
and the gene of interest may be amplified, the genome
fragmented and directly or indirectly transferred to an
expression host for expression of the target protein.
If not previously amplified, the target region is then
amplified, and the cell population screened for cells
producing the target protein. Cells which produce the
target protein at high and stable levels are expanded
and used for expression of the target protein.

DESCRIPTION OF SPECIFIC EMBODIMENTS 
Methods and compositions are provided for
production of mammalian proteins of interest in
culture. The method employs homologous recombination
in a host cell for integrating an amplifiable gene in
the vicinity of a target gene, which target gene
encodes the protein of interest. The region comprising
both the amplifiable gene and target gene will be
referred to as the amplifiable region. The resulting
transformed primary cells may now be subjected to
conditions which select for amplification, or the
amplification may be performed subsequently.
"Transform" includes transformation, transfection,
transduction, conjugation, fusion, electroporation or any
other technique for introducing DNA into a viable cell. The
chromosomes or DNA of the transformed cells are then
used to transfer the amplifiable region into the genome
of secondary expression host cells, where the target
region, if not previously amplified, is amplified. The
resulting cell lines are screened for production of the
target protein and secondary cell lines selected for
desired levels of production, which cells may be
expanded and used for production of the desired protein
in culture.

The primary cell may be any mammalian cell of
interest, particularly mammalian cells which do not
grow readily in culture, more particularly primate
cells, especially human cells, where the human cells
may be normal cells or neoplastic cells, particularly
normal cells. Various cell types may be employed as
the primary cells, including fibroblasts, particularly
diploid skin fibroblasts, lymphocytes, epithelial
cells, neurons, endothelial cells, or other somatic
cells, or germ cells. Of particular interest are skin
fibroblasts, which can be readily propagated to provide
for large numbers of normal cells. These cells may or
may not be expressing the gene of interest. In those
instances where the target gene is inducible or only
expressed in certain differentiated cells, one may
select cells in which the target gene is expressed,
which may require immortalized cells capable of growth
in culture.

A number of amplifiable genes exist, where by
appropriate use of a selection agent, a gene integrated
in the genome will be amplified with adjacent flanking
DNA. Amplifiable genes include dihydrofolate
reductase, metallothionein-I and -II, preferably
primate metallothionein genes, adenosine deaminase,
ornithine decarboxylase, etc. The amplifiable gene
will have transcriptional signals which are functional
in the secondary or expression host and desirably be
functional in the primary host, particularly where
amplification is employed in the primary host or the
amplifiable gene is used as a marker.

The target genes may be any gene of interest,
there already having been a large number of proteins of
interest identified and isolated with continual
additions to the list. Proteins of interest include
cytokines, such as interleukins 1-7; growth factors
such as EGF, FGF, PDGF, and TGF; somatotropins; growth
hormones; colony stimulating factors, such as G-, M-,
and GM-CSF; erythropoietin; plasminogen activators,
such as tissue and urine; enzymes, such as superoxide
dismutase; interferons; T-cell receptors; surface
membrane proteins; insulin; lipoproteins;
α1-antitrypsin; CD proteins, such as CD3, 4, 8, 19;
clotting factors, e.g., Factor VIIIc and von
Willebrand factor; anticlotting factors, such as
Protein C; atrial natriuretic factor; tumor necrosis
factor; etc.

For homologous recombination, constructs will
be prepared where the amplifiable gene will be flanked
on one or both sides with DNA homologous with the DNA
of the target region. The homologous DNA will
generally be within 100 kb, usually 50 kb, preferably
about 25 kb, of the transcribed region of the target
gene, more preferably within 2 kb of the target gene.
By gene is intended the coding region and those
sequences required for transcription of a mature
mRNA. The homologous DNA may include the 5'-upstream
region comprising any enhancer sequences,
transcriptional initiation sequences, the region 5' of
these sequences, or the like. The homologous region
may include a portion of the coding region, where the
coding region may be comprised only of an open reading
frame or combination of exons and introns. The
homologous region may comprise all or a portion of an
intron, where all or a portion of one or more exons may
also be present. Alternatively, the homologous region
may comprise the 3'-region, so as to comprise all or a
portion of the transcription termination region, or the
region 3' of this region. The homologous regions may
extend over all or a portion of the target gene or be
outside the target gene comprising all or a portion of
the transcriptional regulatory regions and/or the
structural gene. For the most part, the homologous
sequence will be joined to the amplifiable gene,
proximally or distally. Usually a sequence other than
the wild-type sequence normally associated with the
target gene will be used to separate the homologous
sequence from the amplifiable gene on at least one side
of the amplifiable gene. Some portion of the sequence
will be the 5' or 3' sequence associated with the
amplifiable gene, as a result of the manipulations
associated with the amplifiable gene.
The homologous regions flanking the
amplifiable gene need not be identical to the target
region, where in vitro mutagenesis is desired. For
example, one may wish to change the transcriptional
initiation region of the target gene, so that a
portion of the homologous region might comprise
nucleotides different from the wild-type 5'-region of
the target gene. Alternatively, one could provide for
insertion of a transcriptional initiation region
different from the wild-type initiation region between
the wild-type initiation region and the structural
gene. Similarly, one might wish to introduce various
mutations into the structural gene, so that the
homologous region would comprise mismatches, resulting
in a change in the encoded protein. For example, a
signal leader sequence would be introduced in proper
reading frame with the target gene to provide for
secretion of the target protein expression product.
Alternatively, one might change the 3' region, e.g.,
untranslated region, polyadenylation site, etc., of the
target gene. Therefore, by homologous recombination,
one can provide for maintaining the integrity of the
target gene, so as to express the wild-type protein
under the transcriptional regulation of the wild-type
promoter, or one may provide for a change in
transcriptional regulation, processing or sequence of
the target gene. In some instances, one may wish to
introduce an enhancer in relation to the
transcriptional initiation region, which can be
provided by, for example, integration of the
amplifiable gene associated with the enhancer in a
region upstream from the transcriptional initiation
regulatory region or in an intron or even downstream
from the target gene.

In order to prepare the subject constructs, it 
will be necessary to know the sequence which is 
targeted for homologous recombination. While it is 
reported that a sequence of 14 bases complementary to a 
sequence in a genome may provide for homologous 
recombination, normally the individual flanking 
sequences will be at least about 150 bp, and may be 10 
kb or more, usually not more than about 5 kb. The size 
of the flanking regions will be determined by the size 
of the known sequence, the number of sequences in the 
genome which may have homology to the site for
integration, whether mutagenesis is involved and the 
extent of separation of the regions for mutagenesis, 
the particular site for integration, or the like. 
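
The sizing guidance for the flanking sequences can be sketched as a
simple check. This is an illustrative sketch only: the function name
and the category labels are hypothetical, and the thresholds are taken
from the figures stated above (at least about 150 bp; usually not more
than about 5 kb; 10 kb or more being possible).

```python
# Illustrative only: thresholds come from the figures in the text;
# the function name and the returned labels are hypothetical.
USUAL_MIN_BP = 150      # "at least about 150 bp"
USUAL_MAX_BP = 5_000    # "usually not more than about 5 kb"


def classify_flank_length(length_bp: int) -> str:
    """Place one flanking (homologous) sequence length into a rough category."""
    if length_bp <= 0:
        raise ValueError("flank length must be a positive number of base pairs")
    if length_bp < USUAL_MIN_BP:
        return "below usual minimum"
    if length_bp <= USUAL_MAX_BP:
        return "usual range"
    # Longer flanks (10 kb or more) are permitted but are not the usual case.
    return "longer than usual"


if __name__ == "__main__":
    for bp in (14, 1_450, 12_000):
        print(bp, "bp:", classify_flank_length(bp))
```

The categories are deliberately coarse; the actual choice also depends
on the known sequence, competing homologies in the genome, and the
site for integration.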

The integrating constructs may be prepared in
accordance with conventional ways, where sequences may
be synthesized, isolated from natural sources,
manipulated, cloned, ligated, subjected to in vitro
mutagenesis, primer repair, or the like. At various
stages, the joined sequences may be cloned, and
analyzed by restriction analysis, sequencing, or the
like. Usually the construct will be carried on a
cloning vector comprising a replication system
functional in a prokaryotic host, e.g., E. coli, and a
marker for selection, e.g., biocide resistance,
complementation to an auxotrophic host, etc. Other
functional groups may also be present, such as
polylinkers, for ease of introduction and excision of
the construct or portions thereof, or the like. A
large number of cloning vectors are available such as
pBR322, the pUC series, etc.

Once the construct is prepared, it may then be
used for homologous recombination in the primary cell
target. Various techniques may be employed for
integrating the construct into the genome of the
primary cell without being joined to a replication
system functional in the primary host. See, for
example, U.S. Patent No. ____, as well as the
references cited in the Relevant Literature section.
Alternatively, the construct may be inserted into an
appropriate vector, usually having a viral replication
system, such as SV40, bovine papilloma virus,
adenovirus, or the like, where the vector may also have
a selectable marker for identifying transfected
cells. Selectable markers include the neo gene,
allowing for selection with G418, the herpes tk gene
for selection with HAT medium, gpt gene with
mycophenolic acid, complementation of an auxotrophic
host, etc.

The vector may or may not be capable of stable
maintenance in the host. Where the vector is capable
of stable maintenance, the cells will be screened for
homologous integration of the vector into the genome of
the host, where various techniques for curing the cells
may be employed. Where the vector is not capable of
stable maintenance, for example, where a temperature
sensitive replication system is employed, one may
change the temperature from the permissive temperature
to the non-permissive temperature, so that the cells
may be cured of the vector. In this case, only those
cells having integration of the construct comprising
the amplifiable gene and, when present, the selectable
marker, will be able to survive selection.

Where a selectable marker is present, one may
select for the presence of the construct by means of
the selectable marker. Where the selectable marker is
not present, one may select for the presence of the
construct by the amplifiable gene. For the neo gene or
the herpes tk gene, one could employ a medium for
growth of the transformants of about 0.1-1 mg/ml of
G418 or HAT medium respectively. Where DHFR is the
amplifiable gene, the selective medium may include from
about 0.01-0.25 µM of methotrexate.

In carrying out the homologous recombination,
the DNA will be introduced into the primary cells.
Techniques which may be used include calcium phosphate/
DNA co-precipitates, microinjection of DNA into the
nucleus, electroporation, bacterial protoplast fusion
with intact cells, transfection, polycations, e.g.,
polybrene, polyornithine, etc., or the like. The DNA
may be single or double stranded DNA. For various
techniques for transforming mammalian cells, see Keown
et al., Methods in Enzymology (1989), in press, and
Mansour et al., Nature, 336:348-352, (1988).

Upstream and/or downstream from the target
region construct may be a gene which provides for
identification of whether a double crossover has
occurred. For this purpose, the herpes simplex virus
thymidine kinase gene may be employed since the
presence of the thymidine kinase gene may be detected
by the use of nucleoside analogs, such as acyclovir or
gancyclovir, for their cytotoxic effects on cells that
contain a functional HSV-tk gene. The absence of
sensitivity to these nucleoside analogs indicates the
absence of the thymidine kinase and, therefore, where
homologous recombination has occurred, that a double
crossover event has also occurred.
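
The reasoning in the preceding paragraph can be summarized as a small
decision sketch. It is offered as an illustration only: the Clone
record, its field names, and the returned labels are hypothetical, and
the logic simply restates the paragraph (survival under marker
selection indicates an integrated construct; insensitivity to the
nucleoside analog indicates loss of the flanking HSV-tk gene).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clone:
    """Hypothetical record of one colony's selection phenotype."""
    name: str
    survives_marker_selection: bool       # e.g. grows in G418 or hygromycin B
    sensitive_to_nucleoside_analog: bool  # killed by acyclovir or gancyclovir


def interpret(clone: Clone) -> str:
    """Restate the text's reasoning about the flanking HSV-tk gene."""
    if not clone.survives_marker_selection:
        return "no integrated construct detected"
    if clone.sensitive_to_nucleoside_analog:
        # Functional HSV-tk is still present, so it was not lost on integration.
        return "HSV-tk retained: not a double crossover"
    # HSV-tk absent; where homologous recombination occurred, this points
    # to a double crossover.
    return "HSV-tk absent: consistent with a double crossover"


if __name__ == "__main__":
    for c in (
        Clone("A", True, False),
        Clone("B", True, True),
        Clone("C", False, False),
    ):
        print(c.name, "->", interpret(c))
```

A clone in the last category is only a candidate; the text separately
describes PCR and sequencing for determining whether homologous
recombination has occurred.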

The presence of the marker gene as evidenced 
by resistance to a biocide or growth in a medium which 
selects for the presence of the marker gene, 
establishes the presence and integration of the target 
construct into the host genome. No further selection 
need be made at this time, since the selection will be 
made in the secondary expression host, where expression 
of the amplified target gene may be detected. If one
wishes, one can determine whether homologous 
recombination has occurred by employing PCR and 
sequencing the resulting amplified DNA sequences. If 
desired, amplification may be performed at this time by 
stressing the primary cells with the appropriate 
amplifying reagent, so that multi-copies of the target 
gene are obtained. Alternatively, amplification may 
await transfer to the secondary cell expression host. 

High molecular weight DNA (greater than about
20 kb, preferably greater than about 50 kb) or,
preferably, metaphase chromosomes are prepared from the
primary recipient cell strain having the appropriate 
integration of the amplification vector. Preparation 
and isolation techniques are described by Nelson and 
Housman, In Gene Transfer (ed. R. Kucherlapati) Plenum 
Press, 1986. The DNA may then be introduced in the
same manner as described above into the secondary host
expression cells, using the same techniques as, or
different techniques from, those employed for the primary cells.
Various mammalian expression hosts are available and
may be employed. These hosts include CHO cells, monkey
kidney cells, C127 mouse fibroblasts, 3T3 mouse cells,
Vero cells, etc. Desirably the hosts will have a
negative background for the amplifiable gene or a gene 
which is substantially less responsive to the 
amplifying agent. 

The transformed cells are grown in selective 
medium containing about 0.01-0.5 µM methotrexate and,
where another marker is present, e.g., the neo gene,
the medium may contain from about 0.1-1 mg/ml G418.
The resistant colonies are isolated and may then be 
analyzed for the presence of the construct in 
juxtaposition to the target gene. This may be as a 
result of detection of expression of the target gene 
product, where there will normally be a negative 
background for the target gene product, use of PCR, 
Southern hybridization, or the like. 

The cells containing the construct are then 
expanded and subjected to selection and amplification 
with media containing progressively higher
concentrations of the amplifying reagent, for example,
0.5-200 µM of methotrexate for the DHFR gene, and may
be analyzed at each selection step for production of 
the target product. 

The various clones may then be screened for 
optimum stable production of the target product and 
these clones may then be expanded and used commercially 
for production in culture. In this manner, high yields 
of a product may be obtained, without the necessity of 
isolating the message and doing the various 
manipulations associated with genetic engineering or 
isolating the genomic gene, where very large genes can 
be a major research and development effort. 

The following examples are offered by way of 
illustration and not by way of limitation. 



EXPERIMENTAL 

Cells 

Normal human diploid skin fibroblasts
("primary recipient") are propagated in EEMEM medium
supplemented with 20% fetal calf serum. Dihydrofolate
reductase (DHFR) deficient Chinese hamster ovary (CHO)
DUKX-B11 cells (Urlaub and Chasin, Proc. Natl. Acad.
Sci. USA 77:4216-4220 (1980)) ("secondary recipient")
are propagated in alpha-medium supplemented with 10%
dialyzed fetal bovine serum.

DNA Vector 

The amplification vector is constructed from
pUC19 (Yanisch-Perron et al., Gene 33:103-119
(1985)). A 1.8 kb HaeII fragment containing a
hygromycin B phosphotransferase gene (hph) driven by
the herpes simplex virus thymidine kinase (HSV tk)
promoter is isolated from pHyg (Sugden et al., Mol.
Cell. Biol. 5:410-413 (1985)) by digestion with HaeII
and gel electrophoresis. Synthetic adaptors are added
onto this fragment to convert the HaeII ends into
HindIII ends and the resulting fragment is joined to
pUC19 digested with HindIII. The resulting plasmid
pUCH contains the hygromycin cassette such that
hph and beta-lactamase are transcribed in
opposite orientations. A 1.3 kb SalI fragment
containing a DHFR gene driven by SV40 transcriptional
signals is isolated from pTND (Connors et al., DNA
7:651-661 (1988)) by digestion with SalI and gel
electrophoresis. This fragment is ligated to pUCH
digested with SalI. The resulting plasmid pUCD
contains the DHFR cassette such that DHFR and hph are
transcribed in the same direction. A 1.76 kb BamHI
fragment from the phage F15 (Friezner Degen et al., J.
Biol. Chem. 261:6972-6985 (1986)) which contains 1.45
kb of DNA flanking the transcriptional start of human
tissue plasminogen activator (t-PA) in addition to the
first exon and part of the first intron is isolated by
gel electrophoresis after BamHI digestion. This
fragment is joined to pUCD following digestion of the
latter with BamHI. The resulting plasmid pUCG has the
promoter of the t-PA fragment oriented opposite to that
of the DHFR cassette. The t-PA fragment contains a
single NcoI site, which is not unique to pUCG. A
partial NcoI digest is carried out and a NotI linker is
inserted. The resulting plasmid pCG contains a unique
NotI site in the t-PA fragment which allows the plasmid
to be linearized prior to transformation of the primary
human diploid fibroblasts in order to increase the
frequency of homologous recombination (Kucherlapati et
al., Proc. Natl. Acad. Sci. USA 81:3153-3157 (1984)).

Preparation of Primary Recipients 

The plasmid pCG linearized with NotI is
introduced into the primary recipients by
electroporation employing DNA at 10 nM. The resulting cells
are then grown in selective medium (EEMEM with 200
µg/ml hygromycin B). Resistant colonies are isolated
and analyzed by PCR (Kim and Smithies, Nucleic Acids
Res. 16:8887-8903 (1988)) using as primers the
sequences GCGGCCTCGGCCTCTGCATA and CATCTCCCCTCTGGAGTGGA
to distinguish homologous integrants from random
ones. Amplification of cellular DNA by PCR using these
two primers yields a fragment of 1.9 kb only when DNA
from correctly targeted cells is present. Cells
comprising the DHFR gene integrated into the t-PA
region are expanded and used as a source of genetic
material for preparation of secondary recipients.
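
As a quick sanity check on the two PCR primers quoted above, their
length, GC content, and reverse complements can be computed directly.
This is a minimal sketch; the helper names are hypothetical, and the
script makes no claim about where the primers anneal in the construct
or in the t-PA locus.

```python
# Primer sequences are copied from the text; helper names are hypothetical.
PRIMER_1 = "GCGGCCTCGGCCTCTGCATA"
PRIMER_2 = "CATCTCCCCTCTGGAGTGGA"

_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def gc_fraction(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


if __name__ == "__main__":
    for name, primer in (("primer 1", PRIMER_1), ("primer 2", PRIMER_2)):
        print(
            f"{name}: {len(primer)} nt, "
            f"GC {gc_fraction(primer):.0%}, "
            f"reverse complement {reverse_complement(primer)}"
        )
```

Both primers are 20 nucleotides long, with GC contents of 70% and 60%
respectively.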

Preparation of Secondary Recipients 
Metaphase chromosomes are prepared (Nelson et
al., J. Mol. Appl. Genet. 2:563-577 (1984)) from
recipients demonstrating homologous recombination with
the DHFR and are then transformed into DHFR-deficient CHO
cells by calcium phosphate mediated gene transfer
(Nelson et al., J. Mol. Appl. Genet. 2:563-577
(1984)). The cells are then grown in selective medium
(alpha-medium containing 200 µg/ml hygromycin B).
Resistant colonies are isolated and analyzed for
expression of human t-PA (Kaufman et al., Mol. Cell.
Biol. 5:1750-1759 (1985)). The cell clones are then
grown in selective medium containing progressively
higher concentrations of methotrexate (0.02-80 µM, with
steps of 4-fold increases in concentration). After
this amplification procedure, the cells are harvested
and the human t-PA is analyzed employing an ELISA assay
with a monoclonal antibody specific for t-PA (Weidle
and Buckel, Gene 51:31-41 (1987)). Clones providing
for high levels of expression of t-PA are stored for
subsequent use.
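
The stepwise methotrexate selection described above (0.02-80 µM in
4-fold steps) can be written out as an explicit concentration series.
The sketch below is illustrative; the function name is hypothetical,
and capping the final step at the stated 80 µM ceiling is an
assumption, since strict 4-fold steps from 0.02 µM would reach 81.92 µM
at the seventh step.

```python
def methotrexate_schedule(start_um: float = 0.02,
                          ceiling_um: float = 80.0,
                          fold: float = 4.0) -> list[float]:
    """Return stepwise concentrations (in µM) from start_um up to ceiling_um.

    Each step is `fold` times the previous one. The last step is capped at
    ceiling_um, which is an assumption of this sketch rather than something
    stated in the text.
    """
    if start_um <= 0 or ceiling_um < start_um or fold <= 1:
        raise ValueError("need 0 < start_um <= ceiling_um and fold > 1")
    steps = []
    conc = start_um
    while conc < ceiling_um:
        steps.append(conc)
        conc *= fold
    steps.append(ceiling_um)  # final selection step at the stated ceiling
    return steps


if __name__ == "__main__":
    for i, conc in enumerate(methotrexate_schedule(), start=1):
        print(f"step {i}: {conc:g} µM methotrexate")
```

With the default values this gives seven selection steps: 0.02, 0.08,
0.32, 1.28, 5.12, 20.48 and 80 µM.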

It is evident from the above results that the
subject method provides for a novel approach to
expression of a wide variety of mammalian genes of
interest. The method is simple, only requires the
knowledge of a sequence of about 300 bp or more in the
region of a target gene, and one may then use
substantially conventional techniques for transferring
the amplifiable region to an expression host, and
production of the desired product in high yield.

All publications and patent applications cited
in this specification are herein incorporated by
reference as if each individual publication or patent
application were specifically and individually
indicated to be incorporated by reference.

Although the foregoing invention has been
described in some detail by way of illustration and
example for purposes of clarity of understanding, it
will be readily apparent to those of ordinary skill in
the art in light of the teachings of this invention
that certain changes and modifications may be made
thereto without departing from the spirit or scope of
the appended claims.